You are viewing the RapidMiner Studio documentation for version 10.0.

Stem (Dictionary) (Text Processing)

Synopsis

Replaces terms by pattern matching rules.

Description

Reduces terms to a base form using an external file of replacement rules. The file must contain one rule per line, in the form:

    targetExpression : pattern1 pattern2 ...

where targetExpression is the term that an input term is replaced with if the input term matches any of the patterns. Each patternX is a simple string or a regular expression. A simple example is the mapping:

    weekday : .*day

Keep in mind that very short words are filtered out by the default settings of the TextInput operators.
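The rule format described above can be illustrated with a minimal Python sketch. It is not the operator's implementation. The function names parse_rules and stem are hypothetical, and several behaviors are assumptions not confirmed by this page: patterns are separated by whitespace, a pattern must match the whole term, the first matching rule wins, and Python's re module stands in for whatever regular expression engine the operator actually uses.

```python
import re

def parse_rules(lines):
    """Parse 'target : pattern1 pattern2 ...' lines into (target, [compiled patterns])."""
    rules = []
    for line in lines:
        line = line.strip()
        if not line or ":" not in line:
            continue  # assumption: blank or malformed lines are skipped
        # Split at the first colon: left side is the target, right side the patterns.
        target, _, patterns = line.partition(":")
        compiled = [re.compile(p) for p in patterns.split()]
        rules.append((target.strip(), compiled))
    return rules

def stem(token, rules):
    """Return the target of the first rule with a pattern matching the whole token."""
    for target, patterns in rules:
        # assumption: a pattern must match the entire term, not just a substring
        if any(p.fullmatch(token) for p in patterns):
            return target
    return token  # no rule matched: the term is left unchanged

# The first rule is the example from the description; the second is hypothetical.
rules = parse_rules(["weekday : .*day", "run : running runs ran"])
tokens = ["monday", "runs", "today", "cat"]
stemmed = [stem(t, rules) for t in tokens]
print(stemmed)  # ['weekday', 'run', 'weekday', 'cat']
```

Under these assumptions, "today" is also reduced to "weekday", which shows that a broad pattern such as .*day can capture unintended terms.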

Input

  • document

    The document port.

  • file (File)

    The file port.

Output

  • document

    The document port.

Parameters

  • file

    File that contains the dictionary. See operator reference for the file format. Range: